Collaborating Authors: Darmstadt


Benchmarking the Attribution Quality of Vision Models
Robin Hesse, Simone Schaub-Meyer, Stefan Roth (Department of Computer Science, Technical University of Darmstadt)

Neural Information Processing Systems

Attribution maps are one of the most established tools for explaining the functioning of computer vision models. They assign importance scores to input features, indicating how relevant each feature is for the prediction of a deep neural network. While much research has gone into proposing new attribution methods, their proper evaluation remains a difficult challenge. In this work, we propose a novel evaluation protocol that overcomes two fundamental limitations of the widely used incremental-deletion protocol: the out-of-domain issue and the lack of inter-model comparisons. This allows us to evaluate 23 attribution methods and to study how different design choices of popular vision backbones affect their attribution quality. We find that intrinsically explainable models outperform standard models and that raw attribution values exhibit a higher attribution quality than what is known from previous work. Further, we show consistent changes in attribution quality when varying the network design, indicating that some standard design choices promote attribution quality.
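
For context, the widely used incremental-deletion protocol that the abstract criticizes can be sketched in a few lines. The following is a minimal illustration, not the paper's proposed protocol; the model interface, the number of steps, and the constant fill value are assumptions. Note that the constant fill value is precisely what pushes perturbed images out of the training domain, the first limitation the abstract mentions.

```python
import numpy as np

def deletion_curve(model, image, attribution, steps=20, fill=0.0):
    """Incremental-deletion protocol (sketch).

    Pixels are "deleted" (set to `fill`) in order of decreasing
    attribution, and the model's confidence in the originally predicted
    class is recorded after each step. A faithful attribution map should
    make the confidence drop quickly, i.e., yield a small area under the
    deletion curve.

    Assumes `model(x)` returns class probabilities and that `image` and
    `attribution` share the same shape (e.g., a grayscale image).
    """
    target = int(np.argmax(model(image)))
    order = np.argsort(attribution.ravel())[::-1]  # most important first
    per_step = max(1, order.size // steps)

    perturbed = image.copy().ravel()
    scores = [float(model(image)[target])]
    for s in range(steps):
        idx = order[s * per_step:(s + 1) * per_step]
        perturbed[idx] = fill  # out-of-domain perturbation
        scores.append(float(model(perturbed.reshape(image.shape))[target]))
    return np.trapz(scores, dx=1.0 / steps)  # lower AUC = better attribution
```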


Dense Unsupervised Learning for Video Segmentation
Nikita Araslanov, Simone Schaub-Meyer, Stefan Roth (Department of Computer Science, TU Darmstadt)

Neural Information Processing Systems

We present a novel approach to unsupervised learning for video object segmentation (VOS). Unlike previous work, our formulation allows learning dense feature representations directly in a fully convolutional regime. We rely on uniform grid sampling to extract a set of anchors and train our model to disambiguate between them on both inter- and intra-video levels. However, a naive scheme to train such a model results in a degenerate solution. We propose to prevent this with a simple regularisation scheme, accommodating the equivariance property of the segmentation task to similarity transformations. Our training objective admits an efficient implementation and exhibits fast training convergence. On established VOS benchmarks, our approach exceeds the segmentation accuracy of previous work despite using significantly less training data and compute power.
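
To make the training objective concrete, here is a minimal sketch under stated assumptions: a fully convolutional encoder producing dense (B, C, H, W) embeddings, a hypothetical grid stride and temperature, and a 90-degree rotation standing in for a generic similarity transform. The first term is the naive disambiguation objective that, as noted above, can degenerate on its own; the second is an equivariance regulariser in the spirit the abstract describes. This is not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def anchor_logits(feats, anchors, tau=0.1):
    """Cosine similarity of every spatial position to every anchor."""
    b, c = feats.shape[:2]
    dense = F.normalize(feats.reshape(b, c, -1), dim=1)        # (B, C, HW)
    anchors = F.normalize(anchors, dim=1)                      # (B, C, K)
    return torch.einsum('bck,bcn->bnk', anchors, dense) / tau  # (B, HW, K)

def training_step(encoder, frames, stride=8):
    feats = encoder(frames)                                    # (B, C, H, W)
    b, c = feats.shape[:2]
    # Uniform grid sampling of anchors from the dense feature map.
    anchors = feats[:, :, ::stride, ::stride].reshape(b, c, -1)

    # Naive disambiguation: push each position toward its nearest anchor.
    # On its own, this self-labelling objective can collapse.
    logits = anchor_logits(feats, anchors)
    loss_cls = F.cross_entropy(logits.flatten(0, 1),
                               logits.argmax(dim=-1).flatten())

    # Equivariance regulariser: encoding a transformed frame should match
    # the transformed encoding of the original frame.
    feats_rot = encoder(torch.rot90(frames, 1, dims=(-2, -1)))
    loss_eq = F.mse_loss(feats_rot, torch.rot90(feats, 1, dims=(-2, -1)))
    return loss_cls + loss_eq
```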


WikiDBs: A Large-Scale Corpus of Relational Databases from Wikidata
Technical University of Darmstadt, Germany

Neural Information Processing Systems

Deep learning on tabular data, and particularly tabular representation learning, has recently gained growing interest. However, representation learning for relational databases with multiple tables is still an underexplored area, which may be attributed to the lack of openly available resources. To support the development of foundation models for tabular data and relational databases, we introduce WikiDBs, a novel open-source corpus of 100,000 relational databases. Each database consists of multiple tables connected by foreign keys. The corpus is based on Wikidata and aims to reflect certain characteristics of real-world databases. In this paper, we describe the dataset and our method for creating it. By making our code publicly available, we enable others to create tailored versions of the dataset, for example, by creating databases in different languages. Finally, we conduct a set of initial experiments to showcase how WikiDBs can be used to train models for data engineering tasks, such as missing-value imputation and column type annotation.
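
To illustrate how such a corpus might be consumed, here is a minimal loading and task-setup sketch. The file layout assumed below (a per-database directory with tables/*.csv and a foreign_keys.csv) is hypothetical; the released corpus documents its own format.

```python
import pandas as pd
from pathlib import Path

def load_database(db_dir):
    """Load one multi-table database: a directory of CSV tables plus a
    file listing the foreign-key links between them (hypothetical layout)."""
    tables = {p.stem: pd.read_csv(p)
              for p in Path(db_dir, "tables").glob("*.csv")}
    foreign_keys = pd.read_csv(Path(db_dir, "foreign_keys.csv"))
    return tables, foreign_keys

def imputation_examples(table, target_col):
    """Frame missing-value imputation as supervised learning: take rows
    where `target_col` is present, hide it, and train a model to recover
    it from the remaining columns."""
    complete = table.dropna(subset=[target_col])
    features = complete.drop(columns=[target_col])
    labels = complete[target_col]
    return features, labels
```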


Catching heuristics are optimal control policies
Boris Belousov, Jan Peters (Department of Computer Science, TU Darmstadt)

Neural Information Processing Systems

Two seemingly contradictory theories attempt to explain how humans move to intercept an airborne ball. One theory posits that humans predict the ball trajectory to optimally plan future actions; the other claims that, instead of performing such complicated computations, humans employ heuristics to reactively choose appropriate actions based on immediate visual feedback. In this paper, we show that interception strategies appearing to be heuristics can be understood as computational solutions to the optimal control problem faced by a ball-catching agent acting under uncertainty. Modeling catching as a continuous partially observable Markov decision process and employing stochastic optimal control theory, we discover that the four main heuristics described in the literature are optimal solutions if the catcher has sufficient time to continuously visually track the ball. Specifically, by varying model parameters such as noise, time to ground contact, and perceptual latency, we show that different strategies arise under different circumstances. The catcher's policy switches between generating reactive and predictive behavior based on the ratio of system to observation noise and the ratio between reaction time and task duration. Thus, we provide a rational account of human ball-catching behavior and a unifying explanation for seemingly contradictory theories of target interception on the basis of stochastic optimal control.
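
The control-theoretic framing can be written down compactly. The following is a generic continuous-state POMDP objective of the kind the abstract describes, not the paper's exact model; the dynamics f, observation model h, cost l, and Gaussian noise terms are placeholders.

```latex
% Ball catching as stochastic optimal control under partial observability (sketch).
% x_t: ball and catcher state; u_t: catcher action; z_t: visual observation.
\begin{align*}
  x_{t+1} &= f(x_t, u_t) + w_t, & w_t &\sim \mathcal{N}(0, \Sigma_w) && \text{(system noise)}\\
  z_t     &= h(x_t) + v_t,      & v_t &\sim \mathcal{N}(0, \Sigma_v) && \text{(observation noise)}\\
  \pi^*   &= \arg\min_{\pi} \; \mathbb{E}\Big[\sum_{t=0}^{T} \ell(x_t, u_t)\Big],
  && \text{with } u_t = \pi(z_{0:t}).
\end{align*}
```

In this framing, the abstract's central claim is that the optimal policy looks reactive (heuristic-like) or predictive depending on the ratio of the system noise to the observation noise and on the ratio of reaction time to the horizon T.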


AI boosting satellite navigation capability, European Space Agency says

FOX News

The European Space Agency said Thursday that it is using artificial intelligence for satellite navigation. The engineering teams of the agency's NAVISP program are working with European industry and academia to "invent the future of navigation," it said, resulting in a growing portfolio of services to improve space-weather and Earth-weather forecasting, enhance autonomous car and boat performance, and help identify rogue drones in sensitive airspace. The program aims to improve "satnav" performance by combining Global Navigation Satellite Systems (GNSS) with AI. "AI comprises all techniques enabling computers to mimic intelligence, whether they be data analysis systems or the embedded intelligence overseeing an autonomous vehicle," Rafael Lucas Rodriguez, the head of the NAVISP Technical Programme Office, said in a statement. "What AI is very good at, through so-called Machine Learning, ML, is extracting meaningful information to identify useful patterns that would otherwise have gone unseen. Satellite navigation is among the fields yielding large amounts of data, so within our sector AI could also serve as the basis of novel approaches and services," he noted.


AIhub coffee corner: AI risks, pause letters and the ensuing discourse

AIHub

This month, in light of the recent prominent discussions relating to perceived AI risks, we consider the pause letters and risk statements, the debate around existential threats, and how this discourse could impact the field and public perceptions. Joining the discussion this time are: Sanmay Das (George Mason University), Tom Dietterich (Oregon State University), Sabine Hauert (University of Bristol), Sarit Kraus (Bar-Ilan University), Anna Tahovská (Czech Technical University), and Oskar von Stryk (Technische Universität Darmstadt). Sabine Hauert: In today's discussion we're going to talk about potential AI risks and the recent discourse around existential threats. Does anyone have any hot reactions? How do you feel about the discourse of existential threat? Tom Dietterich: I agree with Emily Bender and a lot of the critics that it's a distraction and a diversion from thinking about the more immediate threats.


SS-BSN: Attentive Blind-Spot Network for Self-Supervised Denoising with Nonlocal Self-Similarity

arXiv.org Artificial Intelligence

Recently, numerous studies have been conducted on supervised learning-based image denoising methods. However, these methods rely on large-scale noisy-clean image pairs, which are difficult to obtain in practice. To address this limitation, self-supervised denoising methods that can be trained with only noisy images have been proposed. These methods are based on convolutional neural networks (CNNs) and have shown promising performance. However, CNN-based methods do not exploit the nonlocal self-similarities that are essential in traditional methods, which can limit their performance. This paper presents self-similarity attention (SS-Attention), a novel self-attention module that captures nonlocal self-similarities to address this problem. We focus on designing a lightweight self-attention module that operates in a pixel-wise manner, which is nearly impossible with the classic self-attention module because its complexity grows quadratically with spatial resolution. Furthermore, we integrate SS-Attention into a blind-spot network, yielding the self-similarity-based blind-spot network (SS-BSN). We conduct experiments on real-world image denoising tasks. The proposed method quantitatively and qualitatively outperforms state-of-the-art self-supervised denoising methods on the Smartphone Image Denoising Dataset (SIDD) and Darmstadt Noise Dataset (DND) benchmarks.
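
The core efficiency idea, restricting each pixel's attention to a local window of candidate neighbours, can be sketched in a few lines. This is not the paper's SS-Attention module; the window size and scaling are assumptions, and the point is only to show why windowed, pixel-wise attention avoids the quadratic cost of classic self-attention.

```python
import torch
import torch.nn.functional as F

def windowed_self_similarity_attention(x, k=7):
    """Each pixel attends to its own k*k neighbourhood (sketch).

    x: (B, C, H, W). Full self-attention over all H*W pixels costs
    O((H*W)^2); restricting attention to local windows reduces this to
    O(H*W * k^2), which is what makes pixel-wise attention tractable.
    """
    b, c, h, w = x.shape
    pad = k // 2
    # Gather each pixel's k*k neighbourhood: (B, C*k*k, H*W).
    neigh = F.unfold(x, kernel_size=k, padding=pad)
    neigh = neigh.reshape(b, c, k * k, h * w)        # (B, C, k*k, HW)
    query = x.reshape(b, c, 1, h * w)                # (B, C, 1, HW)

    # Self-similarity of each pixel to its neighbours, softmax-normalised.
    attn = (query * neigh).sum(dim=1) / (c ** 0.5)   # (B, k*k, HW)
    attn = attn.softmax(dim=1)

    out = (neigh * attn.unsqueeze(1)).sum(dim=2)     # (B, C, HW)
    return out.reshape(b, c, h, w)
```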


A Flexible Framework for Virtual Omnidirectional Vision to Improve Operator Situation Awareness

arXiv.org Artificial Intelligence

During teleoperation of a mobile robot, providing good operator situation awareness is a major concern, as a single mistake can lead to mission failure. Camera streams are widely used for teleoperation but offer only a limited field of view. In this paper, we present a flexible framework for virtual projections that increases situation awareness, based on a novel method to fuse multiple cameras mounted anywhere on the robot. Moreover, we propose a complementary approach to improve scene understanding by fusing camera images with geometric 3D Lidar data to obtain a colorized point cloud. The implementation on a compact omnidirectional camera reduces system complexity considerably and covers multiple use cases with a much smaller footprint than traditional approaches such as actuated pan-tilt units. Finally, we demonstrate the generality of the approach by applying it to the multi-camera system of the Boston Dynamics Spot. The software implementation is available as open-source ROS packages on the project page https://tu-darmstadt-ros-pkg.github.io/omnidirectional_vision.
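
The camera-Lidar fusion step can be illustrated with a standard pinhole projection. This is a generic colorization sketch, not the released ROS implementation; the calibration matrices are assumed to be given.

```python
import numpy as np

def colorize_point_cloud(points, image, K, T_cam_lidar):
    """Colorize lidar points with camera pixels (sketch).

    points: (N, 3) in the lidar frame; image: (H, W, 3) RGB;
    K: 3x3 camera intrinsics; T_cam_lidar: 4x4 transform from the lidar
    frame into the camera frame. Returns the points that project into
    the image together with their sampled colors.
    """
    # Transform points into the camera frame (homogeneous coordinates).
    pts_h = np.hstack([points, np.ones((len(points), 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]

    in_front = pts_cam[:, 2] > 0       # keep points in front of the camera
    pts_cam = pts_cam[in_front]

    # Pinhole projection to pixel coordinates.
    uv = (K @ pts_cam.T).T
    u = (uv[:, 0] / uv[:, 2]).astype(int)
    v = (uv[:, 1] / uv[:, 2]).astype(int)

    inside = (u >= 0) & (u < image.shape[1]) & (v >= 0) & (v < image.shape[0])
    colors = image[v[inside], u[inside]]  # sample one RGB value per point
    return points[in_front][inside], colors
```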


AIhub coffee corner: Is AI-generated art devaluing the work of artists?

AIHub

This month, we tackle the topic of AI-generated art and what this means for artists. Joining the discussion this time are: Tom Dietterich (Oregon State University), Sabine Hauert (University of Bristol), Sarit Kraus (Bar-Ilan University), Michael Littman (Brown University), Lucy Smith (AIhub), Anna Tahovská (Czech Technical University), and Oskar von Stryk (Technische Universität Darmstadt). Sabine Hauert: This month our topic is AI-generated art. There are lots of questions relating to the value of the art that's generated by these AI systems, whether artists should be working with these tools, and whether that devalues the work that they do. Lucy Smith: I was interested in this case, whereby Shutterstock is now going to sell images created exclusively by OpenAI's DALL-E 2. They say that they're going to compensate the artists whose work they used in training the model, but I don't know how they are going to work out how much each training image has contributed to each created image that they sell.